(57) ABSTRACT

A system and method for increasing capacity utilization of 
non-volatile storage devices within a group of non-volatile 
storage devices used to store data from at least one attached 
system are disclosed. A group of data sets is written to a first 
storage device as part of a write operation such as migration. 
A plurality of storage devices partially filled with data are 
designated as substitutes. The write operation to the first 
storage device is suspended upon receiving a request to read 
a data set stored in the first storage device, such as occurs in 
a recall operation. A second storage device is then selected 
from the plurality of substitute storage devices. The write 
operation is continued by writing data sets from the group of 
data sets included in the write operation that were not written 
to the first storage device to the selected second storage 
device. The requested data is then read from the first storage 
device. After data has been read from the first storage device, 
the first storage device may be designated as a substitute 
storage device so that the partially filled first storage device 
may be selected for continuing write operations. Data sets 
from substitute storage devices may be transferred or 
merged into a lesser number of storage devices during 
recycle operations to prevent the number of substitute stor- 
age devices from increasing beyond a predetermined limit or 
goal. Recycling operations in which data sets from different 
storage devices are transferred or merged may be performed 
by building a first queue including a list of filled tapes 
ordered according to the least amount of valid data and a 
second queue including all unassociated partially filled 
storage devices ordered by the amount of available storage 
space, and merging. The non-volatile storage devices may
include tapes or tape libraries. 

23 Claims, 5 Drawing Sheets

STORAGE MANAGEMENT SYSTEM AND
METHOD FOR INCREASING CAPACITY 
UTILIZATION OF NONVOLATILE STORAGE 
DEVICES USING PARTIALLY FILLED 
SUBSTITUTE STORAGE DEVICES FOR 
CONTINUING WRITE OPERATIONS 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to a system for improving 
capacity utilization of storage devices. 

2. Description of the Related Art 

An explosion of computer data and information, e.g., 
video, sound, pictures, etc., requires an ever increasing 
amount of computer readable storage space. Increasing data 
storage capacity requires improved storage management 
systems to back up and protect data sets, and migrate less
active data sets to secondary storage to increase primary 
storage space. A data set consists of any collection or 
grouping of data. In certain systems, a data set may include 
control information used by the system to manage the data. 
The terms data set and file are generally equivalent and 
sometimes are used interchangeably. Hierarchical storage 
management (HSM) programs manage storage devices, such 
as tape libraries, to control the flow of data between primary 
and secondary storage facilities. Two important HSM pro- 
cedures are migration and recall. The migration procedure 
transfers the least frequently used data sets from primary to 
secondary tape storage. If a user wants to access migrated 
data sets, then the recall procedure retrieves migrated data 
from the secondary storage, e.g., tape, to the primary storage 
device, such as a local hard drive or group of direct access 
storage devices (DASDs). Currently, magnetic tape is the 
preferred media for backups and secondary storage. In the 
future, optical and holographic storage devices may supplant 
magnetic tape. 

Users often want data immediately. Such an immediate 
need for data creates a conflict if a user wants access to a 
data set that is located on a tape currently involved in a 
migration or backup operation. To provide immediate access 
to such tape, the HSM system would have to interrupt the 
migration to allow the user to recall the data set from the 
tape. Otherwise, the user would have to wait until migration 
completed. Wait times can be considerable given that current 
magnetic tapes can take several hours to fill entirely. 

If the migration procedure is interrupted to allow the user 
to recall data, then the tape will be taken away from 
migration for read operations. As a result, the recalled tape 
is likely only partially filled with data. As tape size increases, 
so does the likelihood that a user will need to recall a data 
set on a tape involved in a migration procedure. Increasing 
the number of interruptions to migration operations to 
service recall requests increases the number of partially 
filled tapes as discussed below. 

This tape capacity utilization problem is exacerbated
because during migration/recall type operations, partially 
filled tapes cannot be used to complete a migration after a 
tape has been filled prior to completely migrating a data set. 
For instance, when a tape is filled in the middle of migrating 
a data set, only a blank tape can be used to store the 
remainder of the migrated data set. A partially filled tape 
includes no marker indicating empty portions of the tape. 
Thus, a data set cannot be completed in the middle of a 
partially filled tape because there is no marker to indicate 
where in the partially filled tape the remainder of the data set 
is placed. Using an empty tape to complete the migration of 
a data set creates another partially filled tape.

Tape data set stacking software seeks to improve tape
capacity utilization. Some products involve hardware solutions
to improve tape utilization, such as the International
Business Machines (IBM) Corporation's Magstar Virtual
Tape Server. The Virtual Tape Server employs a cache of
DASD devices which appear as tape devices to the user.
Data is backed up from the virtual tapes, i.e., the DASD
devices, to the tape library in a manner that maximizes tape
capacity utilization.

The IBM Data Facility Storage Management Subsystem
(DFSMS) implemented in the IBM Multiple Virtual Storage
(MVS) operating system provides two techniques for
increasing tape capacity utilization. The Tape Mount Management
(TMM) procedure of DFSMS involves routing data
sets to a DASD pool, called the buffer. The DFSMS software
checks the DASD pool and automatically migrates files from
the DASD pool to tapes. DFSMS further includes a recycling
operation. When a tape is taken away during a migration
or recycle operation by a recall request, the tape is
marked as full. Subsequently, the tapes marked as full are
gathered and the data in the tapes is recycled into a smaller
set of tapes. For instance, if two tapes marked as full are 10%
and 25% occupied with data, the DFSMS program will
merge the data contents from these tapes into a single tape
that is 35% filled.
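
By way of illustration only, the arithmetic of this example may be
sketched as follows. The function name `merged_occupancy`, the use of
whole-number percentages, and the assumption of equal-capacity tapes
are hypothetical and are not taken from DFSMS.

```python
def merged_occupancy(percent_filled):
    """Return the percent occupancy of one output tape holding all inputs.

    Assumes equal-capacity tapes and whole-number percentages.
    """
    total = sum(percent_filled)
    if total > 100:
        raise ValueError("inputs do not fit on a single output tape")
    return total


# Two tapes marked as full at 10% and 25% are merged into one tape.
merged = merged_occupancy([10, 25])
tapes_freed = 2 - 1  # two input tapes replaced by one output tape
```

In this sketch the merged tape is 35% occupied and one tape is
returned to the pool, which matches the example above.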

The recycling process takes tape resources off-line from 
normal HSM operations. At some point, at least two tape 
drives must be set aside to merge multiple input tapes into 
a single output tape. A table of tapes marked as full,
containing both valid and invalid data sets, is built. Data sets
are invalidated over time by the expiration or subsequent
recall of migrated data. The valid data sets from the full
tapes are then merged into an output tape. Those filled tapes
containing the least amount of valid data are typically
recycled first. Moreover, recycling operations take place at
predetermined intervals. Between these recycle periods, 
numerous tapes could become partially filled, marked as 
filled, and set aside. 

Relying on recycling to increase tape capacity utilization
requires that tape drives be taken away from regular
input/output (I/O) operations and dedicated to recycling
operations. Further, additional processing power must be taken
off-line to handle the recycling. Moreover, recycling does
nothing to limit the continually expanding number of
partially filled tapes not marked as full during normal
operations.

SUMMARY OF THE PREFERRED 
EMBODIMENTS 

To overcome the limitations in the prior art described
above, the present invention discloses a system for increasing
capacity utilization of non-volatile storage devices
within a group of non-volatile storage devices used to store
data from at least one attached system. A group of data sets
is written to a first storage device as part of a write operation.
A plurality of storage devices partially filled with data are
designated as substitutes. The write operation to the first
storage device is suspended upon receiving a request to read
a data set stored in the first storage device. A second storage
device is then selected from the plurality of substitute
storage devices. The write operation is continued by writing
data sets from the group of data sets included in the write
operation that were not written to the first storage device to
the selected second storage device. The requested data is
then read from the first storage device.
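
By way of illustration only, the suspension and continuation of a
write operation described above may be sketched as follows. The
`Tape` record, the function `write_with_substitution`, and the
`read_request_at` parameter are hypothetical names introduced for
this sketch. It assumes that the chosen device has room for the
remaining data sets and that the fullest substitute is selected,
which is only one possible selection criterion.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Tape:
    """Hypothetical storage device record (illustrative only)."""
    name: str
    capacity: int
    data_sets: list = field(default_factory=list)  # (label, size) pairs

    @property
    def free(self):
        return self.capacity - sum(size for _, size in self.data_sets)


def write_with_substitution(data_sets, first, substitutes, read_request_at=None):
    """Write data sets to `first`, moving to a substitute on a read request.

    `read_request_at` simulates a read request that is noticed once that
    many complete data sets have been written (None means no request).
    Returns the device that received the last data set.
    """
    target = first
    for written, (label, size) in enumerate(data_sets):
        if target is first and written == read_request_at:
            # Suspend writing to the first device. If no substitute
            # exists, fall back to an empty device (an assumption here).
            if substitutes:
                target = min(substitutes, key=lambda t: t.free)
                substitutes.remove(target)
            else:
                target = Tape("empty", first.capacity)
        target.data_sets.append((label, size))
    return target
```

In this sketch the first device keeps only the data sets written
before the read request, so it remains partially filled and free to
serve the read.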

In further embodiments, after data has been read from the 
first storage device, the first storage device is designated as
a substitute storage device. In this way, the partially filled 
first storage device may be selected for continuing write 
operations. 
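
By way of illustration only, designating a device as a substitute
after a read may be sketched as follows. The table layout (a mapping
from device name to free space) and the function names are
hypothetical, and selecting the fullest substitute is only one
possible criterion.

```python
def release_after_read(name, free_space, substitute_table):
    """Designate a device as a substitute if it still has unused capacity.

    `substitute_table` maps device name -> free space (hypothetical layout).
    Returns True when the device joined the substitute pool.
    """
    if free_space > 0:
        substitute_table[name] = free_space
        return True
    return False


def pick_substitute(substitute_table):
    """Remove and return the name of the fullest substitute, or None."""
    if not substitute_table:
        return None
    name = min(substitute_table, key=substitute_table.get)
    del substitute_table[name]
    return name
```

A device with no unused capacity is not added to the table, so only
partially filled devices are offered for continuing write operations.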

In still further embodiments, the substitute storage 
devices can be used as input during recycling operations, 
wherein data sets from substitute storage devices are transferred
to an output storage device. If an output storage
device is needed to complete recycling operations, one of the 
substitute storage devices can be used as output in recycling 
operations to further improve data capacity utilization. 
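
By way of illustration only, the use of substitute storage devices as
recycling input may be sketched along the lines of the two queues
described in the abstract. The `TapeInfo` record, the function
`build_recycle_input`, and the `goal` parameter are hypothetical
names. The sketch assumes that the second queue places the least
filled substitutes first and that a goal number of the most filled
substitutes is held back from recycling.

```python
from typing import NamedTuple


class TapeInfo(NamedTuple):
    """Hypothetical inventory entry (illustrative only)."""
    name: str
    valid_data: int  # amount of still-valid data on the device
    free_space: int  # unused capacity; 0 for a filled device


def build_recycle_input(filled, substitutes, goal):
    """Return (input_list, retained_substitutes) for a recycle pass."""
    # First queue: filled devices, least valid data first.
    first_queue = sorted(filled, key=lambda t: t.valid_data)
    # Second queue: substitutes, most free space (least filled) first.
    second_queue = sorted(substitutes, key=lambda t: t.free_space, reverse=True)
    # Hold back the `goal` most filled substitutes for write operations.
    n_input = max(len(second_queue) - goal, 0)
    candidates = second_queue[:n_input]
    retained = second_queue[n_input:]
    # Merge both queues and order by least valid data.
    input_list = sorted(first_queue + candidates, key=lambda t: t.valid_data)
    return input_list, retained
```

One of the retained substitutes could then serve as the recycling
output device, although that choice is not modeled in this sketch.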

The preferred embodiments improve storage capacity 
utilization by using partially filled storage devices in write 
operations. This ensures that partially filled storage devices
will be filled with data writes. Moreover, when a storage 
device is subject to a read request, such as a recall, writing 
to the storage device is suspended. This recalled storage 
device is partially filled with data because the write opera- 
tion was likely suspended before the storage device was 
filled. After data has been read from the recalled storage 
device, the storage device is designated as a substitute which 
may then be used to complete other write operations sus- 
pended for recall. In this way, unused data storage capacity 
in substitute storage devices is utilized, thereby improving 
the capacity utilization of such storage devices. 

Yet further, the substitute tapes can be used as input 
during recycling operations to further increase capacity 
utilization. Still further, substitute tapes can be used as 
recycling output to which input data sets are transferred. 
This also increases the capacity utilization of substitute 
tapes. 

Thus, the preferred embodiments improve data capacity 
utilization by using partially filled substitute tapes in migra- 
tion and/or recycling operations to add more data sets to the 
partially filled substitute tapes, thereby improving the capac- 
ity utilization of partially filled tapes. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Referring now to the drawings in which like reference 
numbers represent corresponding parts throughout: 

FIG. 1 is a block diagram illustrating a software and 
hardware environment in which preferred embodiments of 
the present invention are implemented; 

FIG. 2a is a block diagram illustrating a data structure of
a tape inventory table including information on available 
tape cartridges in accordance with preferred embodiments of 
the present invention; 

FIG. 2b is a block diagram illustrating a data structure of 
a substitute tape table including information on those storage 
devices available for substitute operations in accordance 
with preferred embodiments of the present invention; 

FIG. 3 illustrates a flowchart of logic that uses substitute 
tapes with migration operations in accordance with preferred 
embodiments of the present invention; 

FIG. 4 illustrates a flowchart of logic that uses substitute 
tapes with recall operations in accordance with preferred 
embodiments of the present invention; and 

FIG. 5 illustrates a flowchart of logic that uses substitute 
tapes with recycling operations in accordance with preferred 
embodiments of the present invention. 

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENTS 

In the following description, reference is made to the 
accompanying drawings which form a part hereof, and
which show by way of illustration, several embodiments of
the present invention. It is understood that other embodiments
may be utilized and structural changes may be made
without departing from the scope of the present invention.

Hardware and Software Environment 

FIG. 1 illustrates the hardware and software environment 
in which preferred embodiments of the present invention are 
implemented. A host system 2 includes a hierarchical stor- 
age management (HSM) program 4. Data is transferred 
between the host system 2 and secondary storage devices 
managed by a tape subsystem 6 via a network line 8. The
network line 8 may be comprised of any network technology
known in the art, such as Local Area Network (LAN), Wide
Area Network (WAN), Storage Area Network (SAN),
Transmission Control Protocol/Internet Protocol (TCP/IP), the
Internet, etc. The tape subsystem 6 includes tape drives 10a,
b, c. Tape cartridges 12a, b, c, d may be inserted in the tape
drive and accessed by the tape subsystem 6. The tape 
subsystem 6 further includes program logic to manage the 
tape drives 10a, b, c and the tape cartridges 12a, b, c, d. In 
alternative embodiments, the tape subsystem 6 and host 
system 2 may be located on a single computer machine. 

The host system 2 may be any computer system, such as
a mainframe, personal computer, workstation, etc., including
an operating system such as Windows, AIX, Unix, MVS,
etc. (Windows is a registered trademark of Microsoft
Corporation; AIX is a registered trademark and MVS is a
trademark of IBM Corporation; and UNIX is a registered
trademark in the United States and other countries licensed
exclusively through The Open Group.) The HSM program 4
in the host system 2 may include the functionality of HSM
type programs known in the art that manage the transfer of
data to a tape library, such as the IBM DFSMS implemented
in the IBM MVS operating system. The IBM DFSMS
software is described in "DFSMS/MVS V1R4 General
Information," IBM document no. GC26-4900-05, published
by IBM (Copyright 1997, IBM), which publication is
incorporated herein by reference in its entirety. In addition to
including known HSM functions, such as recall and
migration, the HSM program 4 would further include
additional program instructions to perform the operations of the
preferred embodiments of the present invention. The HSM
program 4 may be implemented within the operating system
of the host system 2 or as a separate, installed application
program.

The tape subsystem 6 is similarly comprised of a computer
system and manages a plurality of tape drives 10a, b,
c and tape cartridges 12a, b, c, d. The tape drives 10a, b, c
may be any suitable tape drives known in the art, e.g., the
Magstar 3590 tape drives. Tape cartridges 12a, b, c, d may
be any suitable tape cartridge device known in the art,
such as ECCST, Magstar, IBM 3420, 3480, 3490E, 3590
tape cartridges, etc. (Magstar is a registered trademark of
IBM Corporation.) The tape subsystem 6 may be a manual
tape library in which the user must manually mount tape
cartridges 12a, b, c, d into the tape drives 10a, b, c or an
automated tape library (ATL) in which a robotic arm mounts
tape cartridges 12a, b, c, d in the library into the tape drives
10a, b, c. More or fewer tape drives and/or tape cartridges may
be provided with the tape subsystem 6 than shown in FIG. 1.

In alternative embodiments, alternative storage media
may be substituted for the tape cartridges 12a, b, c, d
discussed above. Any type of non-volatile storage media
could be used, including optical disks, holographic units,
digital video disc (DVD), compact disc-read only memory
(CD-ROM), non-volatile random access memory (RAM),
etc.

The tape subsystem 6 receives commands from the HSM
program 4 in the host system 2 and performs the operations
requested by the HSM program 4, such as migration and
recall, to transfer data between the host system 2 and the
components managed by the tape subsystem 6. In preferred
embodiments, the tape subsystem 6 can simultaneously
process numerous input/output requests from the host
system 2 and any other attached system directed toward the tape
drives 10a, b, c and tape cartridges 12a, b, c, d managed by
the tape subsystem 6. Moreover, the HSM program 4 in the
host system 2 may be capable of multi-tasking,
simultaneously executing numerous input/output operations, and
simultaneously transmitting multiple I/O requests to the tape
subsystem 6 to execute.

In further embodiments, a plurality of host systems 2 may
communicate with the tape subsystem 6 and/or a host system
2 may communicate and transfer data to a plurality of tape
subsystems 6, each subsystem providing access to a library
of tape cartridges.

Managing Storage Devices to Improve Capacity
Utilization

In preferred embodiments, to the extent possible, data sets
are written to, i.e., stacked, on the same tape cartridge 12a,
b, c, d or group of tape cartridges. This increases the
efficiency of tape media usage and reduces the overall
number of tape cartridges 12a, b, c, d needed. Stacking on
the same tape cartridge 12a, b, c, d further installs a group
of related data sets together on a minimum number of tape
cartridges 12a, b, c, d. A data set collection is a group of data
sets which are intended to be allocated on the same tape
cartridge 12a, b, c, d as a result of data set stacking.

In preferred embodiments, the HSM program 4 manages
at least two tables, illustrated in FIGS. 2a and 2b, that are
maintained in the host system 2 memory. FIG. 2a shows a
tape inventory table 30 including information on all the tape
cartridges 12a, b, c, d, e managed by the tape subsystem 6.
As discussed, more tape cartridges, e.g., 12e, may be
provided than shown in FIG. 1. The "Tape No." identifies the
tape and its location in the tape subsystem 6, i.e., where it is
located in the library. The "Status" information indicates the
tape cartridge's status, i.e., involved in migration, recall or
recycling operations. Indication of an "unassociated" status
signifies that the tape is not involved in a current operation.
A tape 12b associated with migration is involved in a
migration operation, a tape 12a associated with recall is the
subject of a recall request, and a tape 12e associated with
recycle is associated with recycling operations.

FIG. 2b shows a substitute tape table 32 indicating those
tapes available as substitutes. The substitute tape table 32
further provides the amount of free storage space available
on the substitute tape cartridges 12c, d. The HSM program
4 may readily access the substitute tape table 32 to determine
which tape to substitute for a tape associated with a
migration or recycling activity taken away to handle a recall
operation.

The HSM program 4 processes the information in the
substitute tape table 32 to select a tape to substitute for a tape
taken away for a recall operation. If a recall request is made
to a tape involved in migration or recycling, a record will
indicate such recall request. After a complete data set has
been transferred during migration or recycling to the tape,
the transfer of further data sets will be suspended and the
recalled tape will be taken away for the recall request. The
selected substitute tape 12d will then take the place of the
recalled tape such that the transfer of data sets will continue
on the substitute tape. The HSM program 4 updates the tape
inventory table 30 with the current status of tapes. The HSM
program 4 may select a tape from the substitute tape table 32
based on predefined criteria, such as substitute tape 12d
having the least amount of available storage space.

A tape that was involved in a migration or recycling
operation that was subject to an interrupting recall request
will sometimes be partially full. After the recall request is
completed, the partially filled tape is then given the
"substitute" status. Substitute tapes are selected to complete the
transfer of data sets involved in a migration or recycling
operation for those tapes recalled and taken away from the
migration or recycling operation.

To prevent the number of substitute tapes from increasing
beyond a predetermined limit, the data sets from substitute
tapes may be recycled into a lesser number of tapes. In
preferred embodiments, recycle operations will occur at
predetermined intervals. The HSM program 4 may maintain
a substitute goal, which may be defined by the user or a
default setting. To perform the recycle operation, the HSM
program 4 builds two recycle queues using the tape
inventory table 30. The first queue includes a list of all filled tapes
ordered according to the least amount of valid data. The
second queue includes all unassociated, partially filled tapes,
which are also the substitute tapes, ordered by the amount of
available space. The HSM program 4 then discards the goal
number, e.g., 10, of substitute tapes most filled with data
from the second queue. The HSM program 4 then merges the
first and second queues into an input list and orders the tapes
according to the least amount of valid data. The first recycle
queue typically excludes tapes involved in ongoing
migration and recycling.

The recycling operation requires at least two tape drives.
An input tape drive is provided to mount all the tapes in the
input queue. In preferred embodiments, the HSM program 4
selects from the tape inventory table 30 a tape associated
with recycling or a substitute tape as the recycle output tape.
In preferred embodiments, the tapes filled with the most
valid data, i.e., the fullest, are selected first as recycling
output, whereas the input substitute tapes are those with the
least amount of valid data.

With this system, during real time operations, partially
filled tapes placed in the substitute pool have their capacity
utilized during migration and recycling activities by
substituting in for tapes taken away for a recall operation. Such
partially filled, unassociated, substitute tapes will also be
used in recycle operations, thereby further increasing the
capacity utilization of partially filled tapes at the recycle end.
This preferred system further allows for the immediate
servicing of a recall request and for the continued migration
or recycling of data to the substitute tape. Moreover, using
the substitution process to increase the capacity utilization of
partially filled tapes during normal operations such as
migration also reduces the need to take away tape drives
10a, b, c for recycle operations in which data sets from tapes
are merged into a smaller number of tapes. Reducing the need
for recycling operations conserves system resources and
minimizes the need to take tape drives 10a, b, c off-line for
such recycling operations.

In the preferred embodiment, a tape load management
policy is implemented by using the most filled substitute
tape to fill in for migration and recycling operations and
using the least filled substitute tape as recycling input, so
that capacity utilization is load controlled. On the substitute
end, the most filled tapes are filled up. On the recycling end,
tapes having the least amount of data are merged. This load
control balancing ensures that tapes at both ends of the
spectrum, i.e., from the most filled to the least filled with
data, will be involved in capacity utilization operations to
improve data capacity. Nonetheless, those skilled in the art
will appreciate that alternative load management policies may
be implemented with respect to selecting substitute tapes to
use in migration and recycling operations. Alternatively, if
recycling is not used to balance the load control, the loading
of substitute tapes during migration may be varied to provide
load control balancing. For instance, a first set of substitutes
based on those that are most filled with data followed by a
second set of substitutes based on those least filled with data
can be consecutively used to substitute in during migration,
thereby balancing the load control, i.e., ensuring that the
capacity of all partially filled tapes, those with the least and
most amounts of valid data, is utilized.

The preferred embodiments may be implemented as a
method, apparatus or article of manufacture using standard
programming and/or engineering techniques to produce
software, firmware, hardware, or any combination thereof.
The term "article of manufacture" (or alternatively,
"computer program product") as used herein is intended to
encompass one or more computer programs and data files
accessible from one or more computer-readable devices,
carriers, or media, such as magnetic storage media, "floppy
disk," CD-ROM, a file server providing access to the
programs via a network transmission line, holographic unit,
etc. Of course, those skilled in the art will recognize that
many modifications may be made to this configuration
without departing from the scope of the present invention.

Logic to Use Substitute Storage Devices During
HSM Operations

FIGS. 3, 4, and 5 are flowcharts illustrating logic included
in the HSM program 4 that implements the substitution
capabilities of the preferred embodiments within migration,
recall, and recycling operations, respectively. In preferred
embodiments, the HSM program 4 is a multi-tasking
program and the operating system of the host system 2 supports
multi-tasking. In such a case, the HSM program 4 may
simultaneously execute threads implementing the logic of
FIGS. 3, 4, 5. Once the HSM program 4 has completed
executing the logic of FIGS. 3, 4, 5, the HSM program 4
would continue executing other threads concurrently being
executed and begin executing further threads. Those skilled
in the art will recognize that this logic is provided for
illustrative purposes only and that different logic may be
used to accomplish the same results. Moreover, the logic
order shown in FIGS. 3, 4, 5 may be performed in an order
other than shown in the figures.

Control begins at block 40 when, during a migration
operation, a complete data set has migrated to the tape
cartridge, e.g., tape cartridge 12b. Control transfers to block
42 which represents the HSM program 4 determining
whether a recall request, executed in another thread, was
made needing the tape cartridge 12b during the migration of
the completed data set. If so, control transfers to block 44;
otherwise, control transfers to block 46. Block 46 represents
the HSM program 4 migrating the next data set included in
the migration operation to the tape cartridge 12b. If there are
further data sets in the migration operation to transfer, the
HSM program 4 would again execute the logic of FIG. 3,
starting at block 40. Block 44 represents the HSM program
4 suspending the migration of further data sets to the tape
12b. Control transfers to block 48 which represents the state
where tape 12b is available to support recall. At this time, the
HSM program 4 may update the tape inventory table 30
indicating that the status of tape 12b is now recall. The HSM
program 4 may then execute a recall thread to start recalling
the requested data set from the tape 12b.

Control transfers to block 50 which represents the HSM
program 4 determining whether there are any tapes in the
substitute tape table 32. If so, control transfers to block 52;
otherwise, control transfers to block 54. If the substitute tape
table 32 is empty, control proceeds to block 54 to request an
empty tape to use to complete the migration process. If the
substitute tape table 32 is not empty, then at block 52, the
HSM program 4 selects the substitute tape having the least
amount of unused storage space, i.e., the fullest. Control
then proceeds to block 58 which represents the HSM
program 4 removing the selected tape from the substitute tape
table 32. Control then proceeds to block 56 to start migrating
data sets to the selected substitute tape 12d or the requested
empty tape. At block 56, the HSM program 4 may further
indicate in the tape inventory table 30 that tape 12d
is involved in migration.

In preferred embodiments involving an Automated Tape
Library, the HSM program 4 will cause the tape
subsystem 6 to mount the selected tape into a tape drive 10a,
b, c. In this embodiment, substitute tapes are not mounted
and must be mounted before data sets can be migrated.
However, those skilled in the art will appreciate that if there
are a sufficient number of tape drives or unused tape drives,
then some substitute tapes may remain mounted in tape
drives. In such a system, there would be no need to mount the
tape once selected.

FIG. 4 illustrates preferred logic executed by the HSM
program 4 upon completion of a recall when substitute tapes
are involved. At block 60, the logic is executed when a recall
has completed with respect to a tape. Control proceeds to
block 62 which represents the HSM program 4 determining
whether the tape released from the completed recall process
has unused capacity. A flag bit may indicate that a tape has
no unused capacity and is full. If there is unused capacity,
i.e., the tape is not full, then control transfers to block 64
which represents the HSM program 4 adding the released
tape to the substitute tape table 32. At this point, the tape
subsystem 6 may demount or dismount the released tape
cartridge. If there is no unused data capacity, then at block
66, the HSM program 4 may cause the tape subsystem 6 to
demount or dismount the full tape and update the tape
inventory table 30 to indicate that the tape is not in use. The
full tape may be filled with valid as well as invalid data sets.

FIG. 5 illustrates preferred recycle logic when substitute
tapes are involved in the recycle operation. Control begins
at block 70 when the HSM program 4 initiates recycling
operations. The HSM program 4 may initiate recycling
operations at predetermined intervals, e.g., the end of the
day. Control proceeds to block 72 which represents the HSM
program 4 processing the data in the tape inventory table 30
to build a first queue data structure including filled tapes. In
preferred embodiments, the HSM program 4 orders the filled
tapes according to the amount of valid data, those with the
least amount of valid data at the top of the queue. As
mentioned, filled tapes can include valid as well as
invalidated data, i.e., migrated data that has expired over time. In
alternative embodiments, the HSM program 4 may order the
filled tapes according to alternative schemes. Control
transfers to block 74 which represents the HSM program 4
processing the information in the tape inventory table 30 to
build a second queue data structure comprised of the
unassociated partially filled tapes. Unassociated, partially filled
tapes are substitute tapes.

The HSM program 4 orders the second queue of substitute
tapes by the amount of data stored in the substitute tapes,
placing the least filled tapes at the front of the queue. Control
then proceeds to block 76 which represents the HSM
program 4 excluding from the second queue the goal number of
substitute tapes most filled with data, i.e., the least amount
of available storage space. For instance, if the HSM program
4 maintains a goal of ten substitute tapes, then the HSM
program 4 would exclude from the second queue data
structure the ten most filled substitute tapes. This step
consolidates the set of substitute tapes to no less than the
preset goal number. Control then transfers to block 78 which
represents the HSM program 4 forming a recycle input list
by merging the tapes listed in the first and second queues and
sorting the list according to those least filled with valid data.

take away the tape involved in recycling to recall, and
provide a substitute tape on which to complete transferring
the data from the tapes from the recycle input list. In this
way, the logic of FIGS. 3, 4, 5 ensures that the capacity of
partially filled tapes is utilized during normal HSM
operations such as migration and recycle.

In embodiments where there are multiple tape subsystems
6, i.e., ATL libraries, the tape inventory table 30 provides
information on all tapes throughout all the ATL libraries,
including location information. The HSM program 4 is
capable of controlling all attached ATL tape libraries. A
multiple library system with pass-through capability allows
a tape to be mechanically passed from one library to another.
This allows the multiple libraries to function as one single
library with the combined tape drives and cartridges of all
libraries managed as a single unit.

In certain multiple tape library embodiments, pass-

The tapes in the input list least filled with data are at the through capability may not be feasible given the substantial 

beginning of the input list and first selected as input for tim e required to mechanically pass tapes between libraries, 

recycling operations, 20 i n selecting a substitute tape in ATL environments lacking 

Control transfers to block 80 which is a decision block pass-through capabilities, the HSM program 4 will have to 

representing the HSM program 4 determining whether there determine whether the selected substitute tape is located in 

is a tape in the tape inventory table 30 associated with the same library as the tape drive allocated to the task for 

recycling operations. If so, control transfers to block 82, which the substitute is needed, i.e., the migration or recycle 

which represents the HSM program 4 selecting the recycle 25 operation. For instance, at block 52 in FIG. 3, after selecting 

associated tape, e.g., tape cartridge 12e, as the recycle output a substitute, the HSM program 4 would determine whether 

tape. If there are no tapes associated with recycle, then the selected substitute tape is located in the library including 

control transfers to block 84 which represents the HSM the tape drive allocated to the migration or recycling task for 

program 4 determining whether there are any tapes in the which the substitute tape is selected. If so, the selected 

tape substitute table 32. If there are no substitute tapes, then 30 substitute tape is used to complete the migration or recycle 

control transfers to block 86 which represents the HSM task interrupted by the recall. However, if the selected 

program 4 requesting an empty tape cartridge as the recycle substitute tape is in a different library from the library 

output tape. If there are substitute tapes, then control trans- including the tape drive allocated to the migration or recy- 

fers to block 88 which represents the HSM program 4 cling operation requiring the substitute tape, then the HSM 

processing the information in the tape substitute table 32 to 35 program 4 selects the next substitute tape from the tape 

select the substitute tape most filled with data as the recycle substitute list 32 that is most filled with data. In preferred 

output tape. embodiments, the selected substitute tape located in a library 

From block 88, control transfers to block 90 which not including the allocated tape drive is added back to the 

represents the HSM program 4 removing the selected sub- substitute tape table 32. Alternatively, the substitute tape is 

stitute tape from the tape substitute table 32. Once a tape is not removed from the substitute tape table 32 unless it is 

selected or requested at blocks 82, 86, and 90, control used m a substitute operation. This determination and selec- 

proceeds to block 92 which represents the HSM program 4 tion continues until the HSM program 4 locates a substitute 

mounting the selected or requested recycle output tape and to pe that resides in the library also including the allocated 

writing the data sets from the tapes listed in the input list to 4S tape drive. 

the mounted output tape. Control transfers to block 94 which In multi-library environments including pass through 

represents the HSM program 4 indicating in the tape inven- capabilities, if the HSM program 4 selects a substitute tape 

tory table 30 that the output tape is associated with recycling that is in a library not including the allocated tape drive, then 

operations. the HSM program 4 can cause the tape subsystem 6 includ- 

In preferred embodiments, the selected target/output tape 50 in S the selected substitute tape, and all intermediary 

will be mounted in one tape drive, and another tape drive libraries, to mechanically pass the selected tape to the library 

will be dedicated for the input tapes. Tapes will be selected including the allocated tape drive, 

from the input list in order and mounted in the input tape Conclusion 
drive. Once all the data sets are copied from an input tape to 

the output tape, the input tape is demounted, and the next 55 This concludes the description of the preferred embodi- 

input tape in the input list is mounted into the input tape ments of the invention. The following describes some alter- 

drive. In alternative embodiments, different combinations of native embodiments for accomplishing the present inven- 

tape drives may be used in the recycling operations. For tion. 

instance, multiple input tapes may be mounted to provide Preferred embodiments were described with respect to 

continuous recycling of tapes from the input list in order to 60 magnetic tapes. However, the discussed preferred embodi- 

avoid the downtime resulting from dismounting and mount- ments could be used in the management of any type of 

ing input tapes from a single input drive. non- volatile storage unit providing backup capacity for low 

If a tape involved in recycling operations is subject to a activity data, e.g., optical disks, holographic units, DVD, 

recall request, then the HSM program 4 will record such a CD-ROM, non-volatile RAM, etc. 

request. After a data set has been transferred to the output 65 In preferred embodiments, the control logic of FIGS. 3, 4, 

recycle tape during recycling operations, the HSM program 5 was implemented in the HSM program 4 maintained in the 

4 may execute the logic of FIG. 3 beginning at block 42 to host system 2. The host system 2 generates commands to 
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cause the tape subsystem 6 to perform the desired input/ 
output operation with respect to tape cartridges 12a, b, c, d 
managed by the tape subsystem 6. However, those skilled in 
the art will appreciate that some portions of the logic 
described with respect to FIGS. 3, 4, 5 could be imple- 5 
mealed in locations other than the host system 2, such as 
within the tape subsystem 6. Moreover, the operations and 
components described with respect to the host system 2, tape 
subsystem 6, and HSM program 4 may be implemented in 
a single computer machine or distributed across a plurality 
of computer machines. 
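
The construction of the recycle input list described for blocks 72 through 78 of FIG. 5 can be illustrated with a minimal sketch. The `Tape` record, its field names, and the function name are hypothetical stand-ins for rows of the tape inventory table 30; the description does not specify a table layout, so this is one possible reading rather than the implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tape:
    """One row of a hypothetical tape inventory table (assumed layout)."""
    volser: str                        # tape identifier (assumed name)
    used_mb: int                       # all data stored on the tape
    valid_mb: int                      # data that has not expired
    filled: bool                       # True once written to capacity
    association: Optional[str] = None  # e.g. "migration" or "recycle"


def build_recycle_input_list(inventory: List[Tape], goal: int) -> List[Tape]:
    # Block 72: first queue of filled tapes, least valid data first.
    first_queue = sorted((t for t in inventory if t.filled),
                         key=lambda t: t.valid_mb)
    # Block 74: second queue of unassociated, partially filled
    # (substitute) tapes, least filled first.
    second_queue = sorted((t for t in inventory
                           if not t.filled
                           and t.used_mb > 0
                           and t.association is None),
                          key=lambda t: t.used_mb)
    # Block 76: hold back the `goal` most-filled substitutes so they
    # stay available for interrupted migration or recycle tasks.
    keep = max(len(second_queue) - goal, 0)
    second_queue = second_queue[:keep]
    # Block 78: merge both queues, least valid data first.
    return sorted(first_queue + second_queue, key=lambda t: t.valid_mb)
```

With a goal of two substitutes and three unassociated partially filled tapes, only the least filled substitute joins the filled tapes in the input list; the two most filled substitutes stay out of recycling.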

In the preferred logic of FIGS. 3, 4, 5, substitute tapes
were not mounted. In an automated tape library (ATL), the
HSM program 4 could control a robotic arm to access and
mount tape cartridges from the tape library. Alternatively, in
a manual mounting system, the HSM program 4 could alert
the user with a message to dismount a tape and mount a
specific tape from the tape library in place of the removed
tape. The HSM program 4 would maintain a tape library
table including information on all tapes within the tape
library and their current capacity.
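
A table holding each tape's capacity use and library location is enough to express the substitute selection described for ATL environments lacking pass-through capability. The sketch below is illustrative only: `SubstituteTape`, its fields, and `select_substitute` are hypothetical names, and it follows the variant in which a tape is removed from the substitute table only when it is actually used. Returning `None` when no substitute shares the drive's library is an assumption; the description does not state what happens in that case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubstituteTape:
    """Hypothetical entry of the tape substitute table."""
    volser: str    # tape identifier (assumed name)
    used_mb: int   # amount of data already stored
    library: str   # library in which the tape resides


def select_substitute(substitutes: List[SubstituteTape],
                      drive_library: str) -> Optional[SubstituteTape]:
    # Consider candidates from most filled to least filled, skipping
    # any tape that is not in the library of the allocated drive.
    for tape in sorted(substitutes, key=lambda t: t.used_mb, reverse=True):
        if tape.library == drive_library:
            substitutes.remove(tape)  # removed only because it is used
            return tape
    return None  # assumption: caller decides what to do next
```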

In summary, preferred embodiments in accordance with
the present invention provide a system for increasing
capacity utilization of non-volatile storage devices within a group
of non-volatile storage devices used to store data from at
least one attached system. A group of data sets is written to
a first storage device as part of a write operation. A plurality
of storage devices partially filled with data are designated as
substitutes. The write operation to the first storage device is
suspended upon receiving a request to read a data set stored
in the first storage device. A second storage device is then
selected from the plurality of substitute storage devices. The
write operation is continued by writing the data sets that are
included in the write operation, but were not written to the
first storage device, to the selected second storage device.
The requested data is then read from the first storage
device.
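
The suspend, substitute, and continue sequence summarized above can be sketched as follows. The function and parameter names are hypothetical, tapes are reduced to names, and the substitute table is reduced to a mapping from tape name to megabytes already used. When no substitute exists, this sketch simply keeps writing to the first tape, which is a simplification not taken from the description.

```python
from typing import Callable, Dict, List


def migrate(data_sets: List[str],
            first_tape: str,
            substitutes: Dict[str, int],
            recall_pending: Callable[[str], bool]) -> Dict[str, List[str]]:
    """Return a mapping of tape name to the data sets written to it."""
    written: Dict[str, List[str]] = {first_tape: []}
    current = first_tape
    for data_set in data_sets:
        # Write one complete data set to the current tape.
        written.setdefault(current, []).append(data_set)
        # Check for a recall only after a complete data set is on tape.
        if current == first_tape and substitutes and recall_pending(first_tape):
            # Suspend writing to the first tape and continue on the
            # substitute holding the most data (least unused space).
            current = max(substitutes, key=substitutes.get)
            del substitutes[current]
            # The first tape is now free for the requested read.
    return written
```

Selecting the most filled substitute mirrors the stated preference for the substitute with the least unused space, so partially filled tapes are topped up before emptier ones are touched.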

The foregoing description of the preferred embodiments
of the invention has been presented for the purposes of
illustration and description. It is not intended to be
exhaustive or to limit the invention to the precise form disclosed.
Many modifications and variations are possible in light of
the above teaching. It is intended that the scope of the
invention be limited not by this detailed description, but
rather by the claims appended hereto. The above
specification, examples and data provide a complete
description of the manufacture and use of the composition of the
invention. Since many embodiments of the invention can be
made without departing from the spirit and scope of the
invention, the invention resides in the claims hereinafter
appended.

What is claimed is: 

1. A method for increasing capacity utilization of
non-volatile storage devices within a group of non-volatile
storage devices used to store data from at least one attached
system, comprising the steps of:

(a) writing a group of data sets to a first storage device as
part of a write operation;

(b) designating a plurality of storage devices partially
filled with data as substitute storage devices;

(c) receiving a request to read a data set stored in the first
storage device while the write operation is continuing;
and

(d) after receiving the read request directed toward the
first storage device, performing the steps of:

(i) suspending the write operation to the first storage
device;

(ii) selecting a second storage device from the plurality
of substitute storage devices;

(iii) continuing the write operation by writing data sets
from the group of data sets included in the write
operation that were not written to the first storage
device to the selected second storage device; and

(iv) reading the requested data set from the first storage
device.

2. The method of claim 1, further comprising the steps of:

designating the first storage device a substitute storage
device after reading the requested data from the first
storage device;

writing an additional group of data sets to a third storage
device as part of an additional write operation;

receiving an additional request to read a data set stored in
the third storage device; and

after receiving the additional read request, performing the
steps of:

(i) suspending the additional write operation to the third
storage device;

(ii) selecting the first storage device from the plurality
of substitute storage devices;

(iii) continuing the write operation by writing data sets
from the additional group of data sets included in the
additional write operation that were not written to the
third storage device to the selected first storage
device; and

(iv) reading the requested data set from the third
storage device as part of the additional read request.

3. The method of claim 1, wherein the step of selecting a 
second storage device further comprises the steps of: 

locating a substitute storage device among the plurality of 
substitute storage devices that has a least amount of 
unused data storage space; and 

selecting the located substitute storage device. 

4. The method of claim 1, further comprising the steps of:

selecting a group of substitute storage devices as input to
a recycling operation; and

transferring data sets from the group of substitute storage
devices to at least one storage device designated as an
output storage device to the recycling operation.

5. The method of claim 4, wherein the step of selecting a 
group of substitute storage devices comprises the steps of: 

selecting a goal value of substitute storage devices; and 
excluding from the group of substitute storage devices a 
number of substitute storage devices equal to the goal 
value, wherein the excluded substitute storage devices 
include a least amount of unused storage space among 
all substitute storage devices. 

6. The method of claim 4, wherein the output storage
device is a substitute storage device not included in the
group of substitute storage devices.

7. The method of claim 1, wherein a plurality of storage
libraries each include a plurality of storage devices, wherein
a storage drive in one of the libraries is allocated to the write
operation, wherein the step of selecting a second storage
device from the plurality of substitute storage devices
comprises the steps of:

selecting a second storage device from the plurality of
substitute storage devices;

determining whether the selected second storage device is
located in the storage library including the allocated
storage drive; and

selecting another substitute storage device as the second
storage device upon determining that the selected second
storage device is not located in the storage library
including the allocated storage drive.

8. A storage management system for increasing capacity
utilization, comprising:

(a) a storage subsystem;

(b) a plurality of storage devices, including a plurality of
storage devices designated as substitutes, wherein the
storage subsystem manages the storage devices;

(c) a host system connected to the storage subsystem and
in data communication with the storage devices; and

(d) program means for:

(i) controlling a write operation in which a group of
data sets are written from the host system to a first
storage device;

(ii) processing a request to read a data set stored in the
first storage device while the write operation is
continuing;

(iii) suspending the write operation to the first storage
device after processing the read request;

(iv) selecting a second storage device from the plurality
of substitute storage devices after suspending the
write operation;

(v) continuing the write operation by writing data sets
from the group of data sets included in the write
operation that were not written to the first storage
device to the selected second storage device; and

(vi) reading the requested data set from the first storage
device.

9. The system of claim 8, wherein the program means is
further capable of designating the first storage device as a
substitute storage device after the host system has read data
from the first storage device.

10. The system of claim 8, wherein each storage device
includes a storage medium that is a member of the set of
media comprising magnetic tape, optical disks, Compact
Disc-Read-Only-Memory (CD-ROM), holographic storage
units, and non-volatile random access memory.

11. The system of claim 8, wherein the storage
management system includes:

a plurality of storage subsystems, wherein each storage
subsystem includes a plurality of storage devices and at
least one storage drive into which one storage device is
capable of being inserted in order to perform read/write
operations with respect to the storage device, and

an allocated storage drive included in one of the storage
subsystems allocated to the write operation, and
wherein the program means is further capable of
determining whether a selected substitute storage device is
located in the storage subsystem including one
allocated storage drive and selecting another substitute
storage device as the substitute storage device upon
determining that the selected storage device is not
located in the storage subsystem including the allocated
storage drive.

12. An article of manufacture for use in programming a
computer system to control a plurality of non-volatile
storage devices that store data, wherein the computer system is
in data communication with the storage devices, the article
of manufacture comprising a computer readable storage
device including a computer program embedded therein that
causes the computer system to perform the steps of:

(a) causing the writing of a group of data sets to a first
storage device as part of a write operation;

(b) designating a plurality of storage devices partially
filled with data as substitute storage devices;

(c) processing a request to read a data set stored in the first
storage device while the write operation is continuing;
and

(d) after processing the read request directed toward the
first storage device, performing the steps of:

(i) suspending the write operation to the first storage
device;

(ii) selecting a second storage device from the plurality
of substitute storage devices;

(iii) continuing the write operation by causing the
writing of data sets from the group of data sets
included in the write operation that were not written
to the first storage device to the selected second
storage device; and

(iv) reading the requested data set from the first storage
device.

13. The article of manufacture of claim 12, wherein the
computer program included in the computer readable
storage device further causes the computer system to perform
the steps of:

designating the first storage device a substitute storage
device after the requested data has been read from the
first storage device;

causing the writing of an additional group of data sets to
a third storage device as part of an additional write
operation;

processing an additional request to read a data set stored
in the third storage device; and

after processing the additional read request, performing
the steps of:

(i) suspending the additional write operation to the third
storage device;

(ii) selecting the first storage device from the plurality
of substitute storage devices;

(iii) continuing the write operation by causing the
writing of data sets from the additional group of data
sets included in the additional write operation that
were not written to the third storage device to the
selected first storage device; and

(iv) reading the data set from the third storage device as
part of the additional read request.

14. The article of manufacture of claim 12, wherein when
the computer program causes the computer system to
perform the step of selecting a second storage device, the
computer program causes the computer system to perform
the steps of:

locating a substitute storage device among the plurality of
substitute storage devices that has a least amount of
unused storage space; and

selecting the located substitute storage device.

15. The article of manufacture of claim 12, wherein the 
computer program included in the computer readable stor- 
age device further causes the computer system to perform 
the steps of: 

selecting a group of substitute storage devices as input to 
a recycling operation; and 

transferring data sets from the group of substitute storage 
devices to at least one storage device designated as an 
output storage device to the recycling operation. 

16. The article of manufacture of claim 15, wherein when 
the computer program causes the computer system to per- 
form the step of selecting a group of substitute storage 
devices, the computer program causes the computer system 
to perform the steps of: 

selecting a goal value of substitute storage devices; and 
excluding from the group of substitute storage devices a 
number of substitute storage devices equal to the goal 
value, wherein the excluded substitute storage devices 
include a least amount of unused storage space among
all substitute storage devices.

17. The article of manufacture of claim 15, wherein the
output storage device is a substitute storage device not
included in the group of substitute storage devices.

18. The article of manufacture of claim 12, wherein a 
plurality of storage libraries each include a plurality of 
storage devices, wherein a storage drive in one of the 
libraries is allocated to the write operation, wherein when 
the computer program causes the computer system to per- 
form the step of selecting a second storage device from the 
plurality of substitute storage devices, the computer program 
causes the computer system to perform the steps of: 

selecting a second storage device from the plurality of 
substitute storage devices; 

determining whether the selected second storage device is 
located in the storage library including the allocated 
storage drive; and 

selecting another substitute storage device as the second 
storage device upon determining that the selected sec- 
ond storage device is not located in the storage library 
including the allocated storage drive. 

19. A computer readable storage medium for storing data 
accessible by at least one storage management program 
being executed on a computer system, wherein the computer 
system is capable of controlling a plurality of non-volatile 
storage devices, comprising: 

a data structure stored in the computer readable storage 
medium, the data structure including information used 
by the storage management program and including: 

(i) a plurality of data objects, wherein each data object 
identifies a storage device; 

(ii) status information for each data object, wherein at 
least one data object includes status information 
identifying the storage device represented by the data 
object as a substitute storage device, wherein the 
storage management program processes the data 
objects and status information therein to select a 
second storage device from substitute storage 
devices after processing a read request directed 
toward a first storage device while a group of data 
sets are being written to a first storage device as part 
of a write operation, and wherein the storage man- 
agement program suspends writing of the group of 
data sets to the first storage device and continues the 
write operation by directing the group of data sets 
not written to the first storage device to the selected
second storage device; and

(iii) location information for each data object, wherein
the storage management program processes the location
information to determine the location of the
selected second storage device represented by the
data object.

20. The computer readable storage medium of claim 19,
wherein after the requested data is read from the first storage
device, the status information of the data object representing
the first storage device is modified to indicate that the first
storage device is a substitute storage device, and wherein the
storage management program is capable of selecting the first
storage device from the substitute storage devices to complete
writing a group of data sets included in a suspended
write operation.

21. The computer readable storage medium of claim 19,
wherein the data structure further comprises capacity
utilization information for each data object indicating the
amount of unused data storage space on the storage device
represented by the data object, wherein the storage
management program processes the capacity utilization information
of each substitute storage device to select the substitute
storage device having a least amount of unused data storage
space.

22. The computer readable storage medium of claim 21,
wherein the location information in the data structure
indicates a library from a plurality of libraries in which the
storage device represented by a data object is located,
wherein a storage drive is allocated to the write operation,
and wherein the step of locating a substitute storage device
comprises the step of the storage management program
processing data in the data structure to locate a substitute
storage device that satisfies a predefined criteria and that is
located in a library including the allocated storage drive.

23. The computer readable storage medium of claim 22,
wherein the predefined criteria is the substitute storage
device having a least amount of unused data storage space,
wherein the at least one storage management program
processes the capacity utilization information and the
location information in the data structure to locate the substitute
storage device that has the least amount of unused data
storage space and that is located in a library including the
allocated storage drive.
